Chapter 19:

Poisson regression, where the outcome is the number of events that occur in an interval of time

Nonlinear least-squares regression, where the relationship between the predictors and numerical outcome can be more complicated than a simple summation of terms in a linear model

LOWESS curve-fitting, where you fit a smooth curve that follows your data without specifying a formula for it

Finally, Part 5 ends with Chapter 20, which provides guidance on the mechanics of regression

modeling, including how to develop a modeling plan, and how to choose variables to include in

models.

A Matter of Life and Death: Working with Survival Data

Sooner or later, everyone dies, and in biological research, it becomes especially important to

characterize that sooner-or-later part as accurately as possible using survival analysis techniques. But

characterizing survival can get tricky. It’s possible to say that patients may live an average of 5.3 years

after they are diagnosed with a particular disease. But what is the exact survival experience? Imagine

you do a study with patients who have this disease. You may ask: Do all patients tend to live around

five or six years, or do half the patients die within the first few months, and the other half survive ten

years or more? And what if some patients live longer than the observational period of your study?

How do you include them in your analysis? And what about participants who stopped returning calls

from your study staff? You do not know if these dropouts went on to live or die. How do you include

their data in your analysis?

The need to study survival with data like these led to the development of survival analysis

techniques. But survival analysis is not only intended to study the outcome of death. You can use

survival analysis to study the time to the first occurrence of non-death events as well, like

remission or recurrence of cancer, the diagnosis of a particular condition, or the resolution of a

particular condition. Survival analysis techniques are presented in Part 6.
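
To make the questions above a little more concrete, here is a minimal sketch of one common way to keep censored participants in the analysis: the Kaplan-Meier estimate. The text above doesn't name this method, so treat it as an illustrative choice. The function name, the six follow-up times, and the censoring flags are all hypothetical. The idea is that participants who outlive the study or drop out still count as "at risk" up to the last time you observed them.

```python
from collections import Counter


def kaplan_meier(times, died):
    """Return [(time, survival_probability)] at each observed death time.

    times: follow-up time for each participant (years; hypothetical data)
    died:  True if death was observed at that time, False if the participant
           was censored (the study ended or contact was lost)
    """
    deaths = Counter(t for t, d in zip(times, died) if d)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Everyone whose follow-up lasted at least until t is still at risk,
        # including participants who are censored later on.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical study: six participants, two of them censored
times = [0.5, 1.0, 2.0, 4.0, 6.0, 8.0]
died = [True, False, True, True, False, True]

curve = kaplan_meier(times, died)
for t, s in curve:
    print(f"{t:>4} years: estimated survival {s:.3f}")
```

Notice that the participant censored at 1.0 years is counted in the at-risk group for the death at 0.5 years, and then simply leaves the at-risk group afterward. That participant is never counted as a death.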

Getting to Know Statistical Distributions

Statistics books always contain tables, so why should this one be any different? Back in the

not-so-good old days, when analysts had to do statistical calculations by hand, they needed to use tables of the

common statistical distributions to complete the calculation of the significance test. They needed tables

for the normal distribution, Student t, chi-square, Fisher F, and others. Now, software does all this for

you, including calculating exact p values, so these printed tables aren’t necessary anymore.
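
As a small illustration of what "software does all this for you" can look like, the sketch below computes a two-sided p value from a standard normal test statistic using only Python's standard library. The function name and the example value of 1.96 are assumptions chosen for illustration. A statistics package would normally handle this step for you.

```python
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Two-sided p value for a test statistic that follows the standard normal distribution."""
    # Probability in the upper tail beyond |z|, doubled to cover both tails
    return 2 * (1 - NormalDist().cdf(abs(z)))


# 1.96 is the familiar cutoff once looked up in a printed normal table
p = two_sided_p_from_z(1.96)
print(f"p = {p:.4f}")
```

The result is very close to 0.05, which is the value analysts used to read off a printed table.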

But you should still be familiar with the common statistical distributions that may describe the

fluctuations in your data, or that may be referenced in the course of performing a statistical calculation.

Chapter 24 contains a list of commonly used distribution functions, with explanations of where you can